Correspondence


The  Channel  Capacity  of  a  Certain  Noisy  Timing 
Channel 

Ira  S.  Moskowitz,  Member,  IEEE ,  and  Allen  R.  Miller 

Abstract — The  effect  of  noise  upon  a  simple  covert  timing  channel  is 
investigated.  Shannon’s  information  theory  is  used  to  quantify  the 
resulting  information  flow  across  the  channel.  In  particular,  how  a 
probabilistic  response  time  to  a  query  by  the  receiver  affects  the  mutual 
information  and  channel  capacity  is  studied.  The  channel  capacity  is 
expressed  in  terms  of  the  critical  probability  for  the  mutual  information 
function, which is given in closed form in terms of Wright’s hypergeometric
function.

Index  Terms — Channel  capacity,  covert  channel,  special  functions. 

I.  Introduction 

We consider an n-user computer system, $n \ge 2$, where there are
two users designated high and low. We assume that certain procedures
have been set up so that low may not read high’s files and
high may not write its files to low. These are the no read up, no
write down requirements of the Bell-LaPadula model [1]. However,
it may be possible for high to covertly pass information to low
over a communication channel that unintentionally, with respect to
the system design, exists in the computer system. Such a means of
communication is referred to as a covert channel. We are interested
in the case where it is possible for high to interfere with the system
response time to low’s input. We will only be concerned with delays
to low’s input of a specific query designated by q.

In  this  correspondence,  we  do  not  propose  methods  of  detecting 
timing  channels  or  of  giving  specifications  which  prevent  covert 
channels [2]-[4]. Instead, we continue in the spirit of Millen [5] by
giving  methods  for  quantifying  the  capacity  of  timing  channels.  In 
fact,  the  first  systematic  capacity  analysis  of  timing  channels  can  be 
found  in  Huskamp’s  dissertation  [6].  The  measurement  of  capacity 
is necessary for certain levels of “Orange Book” certification [7],
which  is  of  great  importance  to  designers  of  secure  systems.  We 
present  an  idealized  situation  that  we  hope  will  lead  to  further 
system-dependent  analysis  of  similar  situations. 

The  communication  between  high  (transmitter)  and  low  (receiver) 
previously  described  is  a  covert  timing  channel,  or  more  succinctly, 
a  timing  channel.  We  are  taking  Wray’s  [8]  definition  of  a  timing 
channel  as  a  “covert  channel  whose  alphabet  is  constructed  from 
different  time  values.”  In  [5],  Millen  discusses  a  simple  timing 
channel  where  a  reply  takes  one  tick  (normalized  time  unit)  if  high 
is  not  interfering  with  low,  and  two  ticks  if  high  is  interfering.  One 
tick  tells  the  low  user  in  Millen’s  scheme  to  interpret  the  message  as 
the  binary  number  0  and  two  ticks  as  the  binary  number  1.  (We  use 
bold face characters for the binary numbers to avoid confusion
later.)  Millen  restricted  his  investigations  to  noiseless  channels.  In 
this  correspondence,  we  obtain  Millen’s  result  as  a  special  case. 

The  noise  that  we  will  be  studying  will  not  affect  the  value  of  the 
output.  The  noise  will  only  affect  the  timing  of  the  output,  unlike  in 
[4],  where  the  timing  was  irrelevant,  but  the  symbols  being  passed 
were  the  important  feature.  The  noise  effects  in  our  model  are 
envisioned  as  being  due  to  time  sharing  delays  of  the  CPU  and  I/O 
caused  by  many  users  contending  for  computing  resources.  We  will 
refer  to  this  as  contention.  Of  course,  it  is  the  contention  that  causes 
the  noise. 

The  users  have  an  a  priori  knowledge  only  of  the  arrival  times  of 
the  response  to  the  query  q  in  the  probabilistic  sense  given  in 
Section  II.  This  probabilistic  arrival  time  is  the  effect  from  the  noise 
in  our  system.  A  strategy  must  be  developed  that  exploits  this 
knowledge  if  high  and  low  are  to  communicate  in  an  efficient 
manner. 

II.  Mathematical  Assumptions  and  Definitions 

We shall use a modified exponential distribution to model the
uncertainty in arrival times of signals to the low user, thus
generalizing the noiseless model of Millen [5]. Suppose that low does its
input query q at time zero. In our noisy system the output will
arrive via an exponential distribution starting one tick after q. If
high is interfering with low, then the output will arrive via an
exponential distribution starting two ticks after q. Again we are
assuming that the responses to q are identical. It is the times at
which responses arrive that are different. Thus, we formalize these
ideas with the following assumptions.

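If high is not interfering with low, then the response time to q,
inputted at time zero, is given by the random variable $X_1$ with
probability density function

$$f_1(t) = \begin{cases} \lambda e^{-\lambda (t-1)}, & t \ge 1, \\ 0, & \text{otherwise}, \end{cases}$$

and if high is interfering with low, then the response time to q is
given by the random variable $X_2$ with probability density function

$$f_2(t) = \begin{cases} \lambda e^{-\lambda (t-2)}, & t \ge 2, \\ 0, & \text{otherwise}. \end{cases}$$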

We model $\lambda$ as being inversely related to the contention. The
parameter $\lambda$ can be adjusted to demonstrate different scenarios with
regard to users contending for resources. Since the expectation of
$X_1$ is $1 + 1/\lambda$, one could estimate $\lambda$ by using system
performance statistics related to mean response time. By letting
$\lambda \to \infty$ we obtain the same situation that Millen set up. Later we
will show how the channel matrix gives the exact relationship between
noise and $\lambda$.
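
The relation between the mean response time and the rate parameter can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the function names `sample_response_time` and `estimate_lambda`, the seed, and the value 2.5 for the rate are all hypothetical choices made for illustration, not part of the original analysis. It draws response times as a shift of one tick plus an exponential delay and then inverts the mean, 1 + 1/λ, to recover the rate.

```python
import random
import statistics


def sample_response_time(lam, interfering, rng):
    """Draw one response time: a shift of 1 tick (no interference) or
    2 ticks (interference) plus an exponential delay with rate lam."""
    shift = 2.0 if interfering else 1.0
    return shift + rng.expovariate(lam)


def estimate_lambda(samples):
    """Invert E[X_1] = 1 + 1/lambda using the sample mean of
    non-interfered response times."""
    return 1.0 / (statistics.fmean(samples) - 1.0)


rng = random.Random(0)          # fixed seed so the run is repeatable
true_lam = 2.5                  # illustrative value only
samples = [sample_response_time(true_lam, False, rng) for _ in range(200_000)]
mean_x1 = statistics.fmean(samples)
lam_hat = estimate_lambda(samples)
print(f"sample mean = {mean_x1:.4f}, estimated rate = {lam_hat:.4f}")
```

On a real system the samples would come from measured response times rather than from a random number generator.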

The lower $\lambda$, the lower the capacity; this is nothing new, since noise
reduces mutual information. Say, however, that we wish to allow
timing channels that have a certain capacity. It may then be possible
to measure the parameter $\lambda$, and if $\lambda$ is too large, the computer
itself could start up background processes to lower $\lambda$ so that the
capacity falls within an acceptable region. Without a way of
quantifying the capacity, this could not be done effectively. This would
allow a system to operate at a high level of efficiency and still stay
within security guidelines.

Let k represent the time that the output (response signal) arrives
after q is inputted. Without any restrictions we have that
$1 \le k < \infty$, which can lead to a situation where low has an infinite
wait for an output to q. Further, let K be the random variable
corresponding to k. The distribution for K is obtained by conditioning on
whether high is interfering (denoted by Int) or not interfering
(denoted by NoInt) with low’s response to q:

$$P(K \le t) = P(K \le t \mid \mathit{NoInt})\,P(\mathit{NoInt}) + P(K \le t \mid \mathit{Int})\,P(\mathit{Int}).$$

Notice that the conditional probability $P(K \le t \mid \mathit{NoInt})$ is just
$P(X_1 \le t)$ and $P(K \le t \mid \mathit{Int}) = P(X_2 \le t)$. High will interfere
with low depending on whether high wishes to send a 0 or a 1 to
low. We assign a probability of p whenever high sends a 0
(NoInt); therefore the probability that high will send a 1 (Int) is
$1 - p$. Thus,

$$P(K \le t) = P(X_1 \le t)\,p + P(X_2 \le t)\,(1 - p).$$
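
The mixture above can be written out directly. This is a minimal sketch, assuming the shifted exponential response times of this section (rate λ, starting at one tick without interference and at two ticks with interference); the function names `shifted_exp_cdf` and `response_cdf` are hypothetical.

```python
import math


def shifted_exp_cdf(t, lam, shift):
    """P(X <= t) for an exponential delay with rate lam that starts
    `shift` ticks after the query."""
    return 1.0 - math.exp(-lam * (t - shift)) if t >= shift else 0.0


def response_cdf(t, lam, p):
    """P(K <= t): mixture over NoInt (probability p, shift 1) and
    Int (probability 1 - p, shift 2)."""
    no_int = shifted_exp_cdf(t, lam, 1.0)
    with_int = shifted_exp_cdf(t, lam, 2.0)
    return p * no_int + (1.0 - p) * with_int


if __name__ == "__main__":
    for t in (0.5, 1.5, 2.0, 3.0, 10.0):
        print(t, round(response_cdf(t, lam=1.0, p=0.5), 4))
```

The distribution function only reaches 1 in the limit of large t, which is the unbounded wait that the two strategies are designed to avoid.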

Of  course  the  way  things  stand  now,  high  must  have  some  feedback 
in  order  to  know  whether  or  not  low  received  the  output.  Because  of 
the  probabilistic  nature  of  the  response  time  to  q  we  have  an 
unbounded  possible  response  time.  Thus,  we  must  make  some 
adjustments in the strategy so that a feasible and realistic communication
channel is set up between high and low. We will adopt two
different  but  related  strategies.  Strategy  1  is  the  simpler  of  the  two, 
but  Strategy  2  is  a  more  efficient  use  of  the  covert  communication 
channel. 


Strategy 1: Low will input q every two ticks. High will interfere
or do nothing. If $1 \le k < 2$, then low will interpret the message as
a 0. If two ticks have gone by on low’s clock, then low will
automatically assume that the message is a 1 and issue an interrupt
to its previous query q before inputting its next query q.

The reason that low must issue an interrupt, if it has not yet
received a response to q, is to prevent a response from “leaking”
over into the next cycle of query and response. Say for example that
low inputs q, two ticks go by and no response is given by the
system, and then low again inputs q. How is low to know, when it
finally does receive a response, if it is the response to the first q or
the second q? The issuance of an interrupt after two ticks will
prevent this situation. We assume that the interrupt stops the
response to q from reaching the low user and that the interrupt acts
instantaneously.

The problem with Strategy 1 is that every cycle takes two ticks
and the high and low user are not making the most efficient use of
their covert communication channel. The next strategy is a much
more efficient use of the resources available.

Strategy 2: Low will input q as soon as it has received its
response from the previous query provided that a response comes in
less than two ticks. If, after two ticks, no response has arrived at
low, then low will automatically issue an interrupt to its previous
query q and issue its next query q. If $1 \le k < 2$, then low will
interpret the message as a 0. If two ticks have gone by on low’s
clock, then low will automatically assume that the message is a 1.

We are assuming that there is no time lag in low deciding, when
necessary, to input q, and that the interrupts behave as described
for Strategy 1. Ideas similar to the communication protocol in the
above strategies are explored more fully in the work of Lee and
Davidson [9], where they discuss deadlines in timed synchronous
communication.

III.  Transmission  Errors

There are obvious transmission errors in our strategies which
result in noise. The results in this section and the next one hold for
both strategies. Let X be the random variable representing the
input to the covert communication channel, i.e., the high user, and
let Y represent the output random variable corresponding to low.
The channel is a discrete memoryless channel.

Let $P(i \mid j)$ be the probability of an i being received by low
given that a j was sent by high, where i, j = 0, 1. There are no
errors if high sends a 1 since $2 \le k$. The low user is watching its
clock and as soon as two ticks have gone by, low interprets the
message as a 1, which is correct. Thus, we have

$$P(1 \mid 1) = 1 \quad \text{and} \quad P(0 \mid 1) = 0. \tag{1}$$

However, if high wishes to send a 0, then errors can be introduced.
If the output arrives before two ticks have elapsed there is no
transmission error. However, if because of contention $2 \le k$, then
we do have an error because low will interpret the message as a 1
when it is in fact a 0. The probability of a 0 being sent and a 0 being
received is

$$P(0 \mid 0) = \int_1^2 \lambda e^{-\lambda (t-1)}\,dt = 1 - e^{-\lambda}. \tag{2}$$

Further, the probability of a 0 being sent and a 1 being received is

$$P(1 \mid 0) = \int_2^{\infty} \lambda e^{-\lambda (t-1)}\,dt = e^{-\lambda}. \tag{3}$$

The channel capacity of the covert timing channel will be calculated
in Section IV by using (1), (2), and (3).
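
The transition probabilities (1), (2), and (3) can be checked against a direct simulation of low's decoding rule. The sketch below is illustrative only: `channel_matrix` and `decode` are hypothetical names, the rate 1.5 and the seed are arbitrary, and the response delay is assumed to be a shift of one or two ticks plus an exponential delay with rate λ, as in Section II.

```python
import math
import random


def channel_matrix(lam):
    """Rows: symbol sent by high (0, 1). Columns: symbol received by low."""
    return [[1.0 - math.exp(-lam), math.exp(-lam)],
            [0.0, 1.0]]


def decode(sent, lam, rng):
    """Low decodes 0 only if the response arrives strictly before two
    ticks; otherwise it interrupts the query and decodes 1."""
    shift = 1.0 if sent == 0 else 2.0
    arrival = shift + rng.expovariate(lam)
    return 0 if arrival < 2.0 else 1


rng = random.Random(1)
lam = 1.5                      # illustrative rate
trials = 100_000
# Fraction of sent 0s that low reads as 1 (should be close to e^{-lam}).
errors_0 = sum(decode(0, lam, rng) for _ in range(trials)) / trials
# Fraction of sent 1s that low reads as 0 (should be exactly zero).
errors_1 = sum(1 - decode(1, lam, rng) for _ in range(trials)) / trials
print(errors_0, channel_matrix(lam)[0][1], errors_1)
```

Only the symbol 0 can be corrupted, so the channel is asymmetric, which is why the optimal input distribution is not uniform in general.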


IV.  Capacity  Analysis  of  Strategy  1 

For  now  we  are  only  trying  to  calculate  the  information  flow  in 
units of bits per symbol. For Strategy 1, the difference between bits
per symbol and bits per tick is trivial, i.e., a factor of 1/2.
However,  for  Strategy  2  there  is  a  substantial  difference  and  we  will 
address  this  issue  in  Section  V. 

As soon as the low user inputs its query q, high inputs either a 0
or a 1. A 0 corresponds to no interference and a 1 corresponds to
interference. The response to q is always the same, for it is the time
at which this response arrives that determines the symbol being
passed over the channel. If the response arrives between one and
two ticks, but not equal to two ticks, Y is set equal to 0. If the
response has not yet arrived at two ticks, or arrives at exactly two
ticks, then Y is set equal to 1. The channel matrix from (1), (2),
and (3) is given by

$$\begin{pmatrix} P(0 \mid 0) & P(1 \mid 0) \\ P(0 \mid 1) & P(1 \mid 1) \end{pmatrix} = \begin{pmatrix} 1 - e^{-\lambda} & e^{-\lambda} \\ 0 & 1 \end{pmatrix} \tag{4}$$

and shows how $\lambda$ influences noise in the communication channel.
Let I(X, Y) and C be respectively the mutual information between
X and Y and the channel capacity, both of which have units in bits
per symbol. The mutual information I(X, Y) expressed as a function
of p is given by

$$I(p) = -p \log p + e^{-\lambda} p \log (e^{-\lambda} p) - (1 - p + e^{-\lambda} p) \log (1 - p + e^{-\lambda} p), \tag{5}$$

where logarithms are computed using base 2. We have obtained (5)
by calculating the mutual information

$$I(X, Y) = H(X) - H_Y(X)$$

as the difference between the input entropy and the equivocation
[10]. The capacity for this channel is the maximum of I(p) with
respect to p. Since the mutual information function I(p) is concave
down [11, Theorem 5.2.5] with respect to the variable p, it suffices
to find the critical point $\hat{p}$ determined by the equation $I'(\hat{p}) = 0$.
Thus, from (5), we have

$$I'(p) = -\log p + e^{-\lambda} \log (e^{-\lambda} p) + (1 - e^{-\lambda}) \log (1 - p + e^{-\lambda} p),$$

so that the critical point is given by

$$\hat{p} = \frac{1}{1 + e^{\lambda/(e^{\lambda} - 1)} - e^{-\lambda}}.$$

Since both $e^{-\lambda} \to 0$ and $\lambda/(e^{\lambda} - 1) \to 0$ as $\lambda \to \infty$, we see that

$$\lim_{\lambda \to \infty} \hat{p} = 1/2.$$
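
As a numerical cross-check of the bits-per-symbol analysis, the mutual information (5) can be evaluated at its critical point and compared with a brute-force search. This is a sketch, not the authors' computation: the names `mutual_information`, `critical_point`, and `capacity` are hypothetical, the critical point is taken to be p = 1/(1 + e^{λ/(e^λ − 1)} − e^{−λ}), which is what setting the derivative of (5) to zero yields, and the rate 1.0 is an arbitrary test value.

```python
import math


def mutual_information(p, lam):
    """Equation (5) in bits per symbol, with the convention 0 log 0 = 0."""
    x = math.exp(-lam)

    def xlog2(v):
        return v * math.log2(v) if v > 0.0 else 0.0

    received_one = 1.0 - p + x * p      # probability that low reads a 1
    return -xlog2(p) + xlog2(x * p) - xlog2(received_one)


def critical_point(lam):
    """Root of I'(p) = 0 in closed form."""
    return 1.0 / (1.0 + math.exp(lam / math.expm1(lam)) - math.exp(-lam))


def capacity(lam):
    """Capacity in bits per symbol: (5) evaluated at the critical point."""
    return mutual_information(critical_point(lam), lam)


lam = 1.0
p_hat = critical_point(lam)
# Brute-force maximizer on a grid of 999 interior points, for comparison.
p_grid = max(range(1, 1000), key=lambda i: mutual_information(i / 1000, lam)) / 1000
print(p_hat, p_grid, capacity(lam), mutual_information(0.5, lam))
```

The last two printed values show how little is lost by sending 0s and 1s with equal probability instead of using the optimal input distribution.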

The capacity is the mutual information function evaluated at $\hat{p}$; thus

$$C(\lambda) = -\hat{p} \log \hat{p} + e^{-\lambda} \hat{p} \log (e^{-\lambda} \hat{p}) - (1 - \hat{p} + e^{-\lambda} \hat{p}) \log (1 - \hat{p} + e^{-\lambda} \hat{p}).$$

The critical point $\hat{p}$ quickly becomes asymptotic to 1/2. Thus, for
$\lambda \gg 0$, I(p) is nearly optimized for an input probability distribution
where both 0’s and 1’s are sent with equal probabilities of 1/2.
Numerical calculations [12] show that $C(\lambda) - I(1/2)$ is small and
quickly approaches zero as $\lambda \to \infty$. This is not surprising in light of
a recent result of Majani and Rumsey [13] that for a binary-input
discrete memoryless channel, I(1/2) is at least 94.21% of the
capacity.


V.  Capacity  Analysis  for  Strategy  2

If we do a bit per symbol analysis of both strategies, they are
identical. However, if we do a bit per tick analysis they are quite
different. This is due to the fact that low will issue its next query q
as soon as it has received a response from its last query, provided
that no more than two ticks have elapsed from the issuance of the
former query. If E[T] is the average time it takes to send a symbol
across the channel, then the mutual information of a discrete
memoryless channel, in bits per tick, is defined by

$$I_t = \frac{I(X, Y)}{E[T]}. \tag{6}$$

Here we are using the notational convenience that the subscript t
means units are given in bits per tick.

It would seem natural to try to maximize $I_t$ to get the actual
channel capacity in units of bits per tick. Verdú [14, Theorem 2]
studied the capacity in units of bits per unit cost, $C_u$, of a memoryless
(stationary) channel. Let b[X] be the cost function associated with
the input random variable X. Then Verdú’s theorem states that

$$C_u = \sup_X \frac{I(X, Y)}{E[b[X]]}, \tag{7}$$

where the supremum is taken over different probability measures for
X with the alphabet of X fixed.

Of course E[b[X]], the expected value of b[X], is given in
units of unit cost per symbol. If the cost function is the time it takes
to send symbols, then we can replace E[b[X]] by E[T]. Combining
(6) and (7) we can express the capacity per unit time as the
supremum of the mutual information per unit time:

$$C_t = \sup_X \frac{I(X, Y)}{E[T]} = \sup_X I_t. \tag{8}$$

Note that the optimizing process over X involves I(X, Y) and
E[T] simultaneously. Obviously one would not want to code the
message by just minimizing time because we would lose information
by not using enough different symbols.

The actual distribution of the query response random variable T
is governed by the distributions $X_1$, $X_2$, and Strategy 2. For time
values less than one tick or greater than two ticks the probability
density function f(t) of T is zero since the response can never
arrive at those times. For time values greater than or equal to one
tick and strictly less than two ticks the behavior of T is governed by
$f_1(t)$.

In order to obtain the probability density function f(t), we
calculate the derivative of the associated cumulative distribution
function $F(t) = P(T \le t)$. Hence, $F'(t) = f(t)$ and by conditioning
we see that

$$P(T \le t) = P(T \le t \mid 0)\,P(0) + P(T \le t \mid 1)\,P(1).$$

Now if $1 \le t < 2$,

$$\begin{aligned} P(1 \le T \le t) &= P(1 \le T \le t \mid 0)\,P(0) + P(1 \le T \le t \mid 1)\,P(1) \\ &= P(1 \le X_1 \le t)\,p + P(1 \le X_2 \le t)\,(1 - p) \\ &= p \int_1^t \lambda e^{-\lambda (t_1 - 1)}\,dt_1 \end{aligned}$$

and, since two ticks is the cut-off time,

$$P(T = 2) = p \int_2^{\infty} \lambda e^{-\lambda (t-1)}\,dt + (1 - p) \int_2^{\infty} \lambda e^{-\lambda (t-2)}\,dt = e^{-\lambda} p + 1 - p.$$

Therefore, we have

$$\begin{aligned} P(T > 2) &= 0, \\ P(T = 2) &= e^{-\lambda} p + 1 - p, \\ P(1 \le T \le t) &= p \int_1^t \lambda e^{-\lambda (t_1 - 1)}\,dt_1, \quad 1 \le t < 2, \\ P(T < 1) &= 0. \end{aligned}$$

Hence, the density function of T is given by

$$f(t) = \delta(t - 2)\,[p e^{-\lambda} + 1 - p] + \lambda e^{-\lambda (t-1)}\,p\,\chi_{[1,2)}(t),$$

where $\delta(\cdot)$ is the Dirac delta function and $\chi_{[1,2)}(\cdot)$ is the
characteristic (or indicator) function of the interval [1, 2). To find the
expected value of T, since

$$E[T] = \int_{-\infty}^{\infty} t f(t)\,dt,$$

we see that

$$E[T] = \int_{-\infty}^{\infty} t\,\delta(t - 2)\,[p e^{-\lambda} + 1 - p]\,dt + p \int_1^2 t \lambda e^{-\lambda (t-1)}\,dt,$$

and on performing the two integrations we obtain

$$E[T] = 2(p e^{-\lambda} + 1 - p) + p(1 - e^{-\lambda})\left(\frac{1}{\lambda} + \frac{e^{\lambda} - 2}{e^{\lambda} - 1}\right). \tag{9}$$
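
The expected cycle time (9) and the bits-per-tick capacity (8) can be explored numerically. The sketch below makes several assumptions that are not in the original text: the function names are hypothetical, E[T] is taken in the form 2(pe^{−λ} + 1 − p) + p(1 − e^{−λ})(1/λ + (e^λ − 2)/(e^λ − 1)), the maximization uses a plain grid search rather than any closed form, and the rates 1.2 and 500 are arbitrary test values. For very large λ the maximum should approach the noiseless value log((1 + √5)/2) given in (15).

```python
import math
import random


def mutual_information(p, lam):
    """Equation (5), in bits per symbol, using the convention 0 log 0 = 0."""
    x = math.exp(-lam)

    def xlog2(v):
        return v * math.log2(v) if v > 0.0 else 0.0

    return -xlog2(p) + xlog2(x * p) - xlog2(1.0 - p + x * p)


def expected_time(p, lam):
    """Equation (9). The factor (e^lam - 2)/(e^lam - 1) is rewritten as
    (1 - 2x)/(1 - x) with x = e^(-lam) so that large lam cannot overflow."""
    x = math.exp(-lam)
    return 2.0 * (p * x + 1.0 - p) + p * ((1.0 - x) / lam + 1.0 - 2.0 * x)


def simulate_time(p, lam, trials, rng):
    """Monte Carlo estimate of E[T]: a 0 arrives at 1 + Exp(lam) but is
    cut off at two ticks; a 1 always costs exactly two ticks."""
    total = 0.0
    for _ in range(trials):
        if rng.random() < p:
            total += min(1.0 + rng.expovariate(lam), 2.0)
        else:
            total += 2.0
    return total / trials


def bits_per_tick(p, lam):
    """Equation (6) as a function of p."""
    return mutual_information(p, lam) / expected_time(p, lam)


def capacity_per_tick(lam, steps=20_000):
    """Grid search for the maximum of bits_per_tick over p in (0, 1).
    Returns (maximizing p, maximum value)."""
    value, p_best = max((bits_per_tick(i / steps, lam), i / steps)
                        for i in range(1, steps))
    return p_best, value


rng = random.Random(2)
closed_form = expected_time(0.4, 1.2)
simulated = simulate_time(0.4, 1.2, 200_000, rng)
p_star, c_star = capacity_per_tick(500.0)
print(closed_form, simulated, p_star, c_star)
```

A grid search is used here only because it is easy to verify; any one-dimensional maximizer would serve equally well.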


When $\lambda$ is infinite there is no contention and hence the channel is
noiseless. In this case, the channel matrix (4) becomes

$$\begin{pmatrix} P(0 \mid 0) & P(1 \mid 0) \\ P(0 \mid 1) & P(1 \mid 1) \end{pmatrix}_{\lambda = \infty} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Further, from (9) the expectation is given by

$$E[T]_{\lambda = \infty} = 2 - p.$$

When the contention is maximized, $\lambda$ is zero. Therefore, the
channel matrix (4) becomes

$$\begin{pmatrix} P(0 \mid 0) & P(1 \mid 0) \\ P(0 \mid 1) & P(1 \mid 1) \end{pmatrix}_{\lambda = 0} = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}.$$

Thus, we see that low cannot infer at all whether high sent a 0 or a
1. In fact, low will only receive the symbol 1. Applying L’Hôpital’s
rule twice shows that the last factor of (9),
$1/\lambda + (e^{\lambda} - 2)/(e^{\lambda} - 1)$, tends to 3/2 as $\lambda \to 0$, while
$1 - e^{-\lambda} \to 0$, so that

$$E[T]_{\lambda = 0} = 2.$$

We now express the mutual information in terms of bits per tick.
From (5), (6), and (9) we get

$$I_t(p) = \frac{-p \log p + e^{-\lambda} p \log (e^{-\lambda} p) - (1 - p + e^{-\lambda} p) \log (1 - p + e^{-\lambda} p)}{2(p e^{-\lambda} + 1 - p) + p(1 - e^{-\lambda})\left(\dfrac{1}{\lambda} + \dfrac{e^{\lambda} - 2}{e^{\lambda} - 1}\right)}, \tag{10}$$

and from (8) the channel capacity as a function of $\lambda$ is given by

$$C_t(\lambda) = \sup_p \frac{-p \log p + e^{-\lambda} p \log (e^{-\lambda} p) - (1 - p + e^{-\lambda} p) \log (1 - p + e^{-\lambda} p)}{2(p e^{-\lambda} + 1 - p) + p(1 - e^{-\lambda})\left(\dfrac{1}{\lambda} + \dfrac{e^{\lambda} - 2}{e^{\lambda} - 1}\right)}.$$

When $\lambda$ is equal to zero or $\infty$, define $C_t(\lambda)$ by its limiting values.
Thus, $C_t(\lambda)$ is a continuous function for $\lambda \in [0, \infty]$ and we may
write

$$C_t(\lambda) = \max_{p \in [0, 1]} \frac{-p \log p + e^{-\lambda} p \log (e^{-\lambda} p) - (1 - p + e^{-\lambda} p) \log (1 - p + e^{-\lambda} p)}{2(p e^{-\lambda} + 1 - p) + p(1 - e^{-\lambda})\left(\dfrac{1}{\lambda} + \dfrac{e^{\lambda} - 2}{e^{\lambda} - 1}\right)}.$$


VI.  Exact  Result  for  the  Channel  Capacity  of
Strategy  2

Since $I_t(p)$, given by (10), is a nonnegative differentiable function
for $p \in (0, 1)$ and its values are zero at the boundary of the
interval, it suffices to find a unique critical point, $p_c \in (0, 1)$, for
then we know that $I_t(p_c) = C_t(\lambda)$.

Taking the derivative of $I_t(p)$ with respect to p and setting it
equal to zero, we arrive at (after some algebraic simplification)

$$x \ln x + (y - u) \ln (1 - yp) - y \ln p = 0, \tag{11}$$

where

$$x = e^{-\lambda}, \qquad y = 1 - x, \qquad 2u = 1 + y/\ln x.$$

Exponentiating both sides of (11) and setting

$$\eta = \frac{y}{y - u}, \qquad p = x^{x/y}\,\tau,$$

we obtain

$$\tau^{\eta} + (x^x y^y)^{1/y}\,\tau - 1 = 0. \tag{12}$$

We recall the Wright function [15] defined by

$${}_1\Psi_1\!\left[\begin{matrix} (\alpha, A); \\ (\beta, B); \end{matrix}\; z\right] = \sum_{k=0}^{\infty} \frac{\Gamma(\alpha + Ak)}{\Gamma(\beta + Bk)}\,\frac{z^k}{k!}$$

and the definition of the Pochhammer symbol

$$(\lambda)_n = \frac{\Gamma(\lambda + n)}{\Gamma(\lambda)},$$

where $\Gamma(z)$ is the Gamma function and n is an integer. For
conciseness in what follows we shall write $\Psi$ for ${}_1\Psi_1$.

In [16], Miller showed that Mellin’s result [17] concerning the
roots of trinomial equations could be extended to include positive
non-integer exponents as well. In particular, we have the following.

For $\omega > 1$, the unique positive root of the transcendental equation

$$\tau^{\omega} + \mu\tau - 1 = 0$$

is given by

$$\tau = \frac{1}{\omega}\,\Psi\!\left[\begin{matrix} \left(\dfrac{1}{\omega}, \dfrac{1}{\omega}\right); \\ \left(\dfrac{1}{\omega} + 1, \dfrac{1}{\omega} - 1\right); \end{matrix}\; -\mu\right]$$

provided that

$$|\mu| < \frac{\omega}{(\omega - 1)^{1 - 1/\omega}}. \tag{13}$$

To apply this result to (12) we must verify that the inequality (13)
is satisfied. Considering $\eta$ as a function of $x \in [0, 1]$, it is easy to
show that $4/3 \le \eta \le 2$. Further, since

$$(x^x y^y)^{1/y} \le 1, \qquad x, y \in [0, 1],$$

and

$$\min_{4/3 \le \eta \le 2} \frac{\eta}{(\eta - 1)^{1 - 1/\eta}} \approx 1.755 > 1,$$

we see that

$$(x^x y^y)^{1/y} < \frac{\eta}{(\eta - 1)^{1 - 1/\eta}}, \qquad x \in [0, 1].$$

Therefore, the previous result may be applied and we arrive at the
following.

The channel capacity for $0 < \lambda < \infty$ is

$$C_t(\lambda) = \frac{-p_c \log p_c + e^{-\lambda} p_c \log (e^{-\lambda} p_c) - (1 - p_c + e^{-\lambda} p_c) \log (1 - p_c + e^{-\lambda} p_c)}{2(p_c e^{-\lambda} + 1 - p_c) + p_c(1 - e^{-\lambda})\left(\dfrac{1}{\lambda} + \dfrac{e^{\lambda} - 2}{e^{\lambda} - 1}\right)},$$

where the critical probability for the mutual information function
$I_t(p)$, $p \in (0, 1)$, is given by

$$p_c = \frac{x^{x/y}}{\eta}\,\Psi\!\left[\begin{matrix} \left(\dfrac{1}{\eta}, \dfrac{1}{\eta}\right); \\ \left(\dfrac{1}{\eta} + 1, \dfrac{1}{\eta} - 1\right); \end{matrix}\; -(x^x y^y)^{1/y}\right]. \tag{14}$$

Let us consider the two boundary cases. When $\lambda = 0$, there is
infinite noise and (10) is identically equal to zero. Therefore, the
channel capacity is zero and the critical probability is of no concern.
When $\lambda = \infty$, there is no noise, which is the situation that Millen
studied.

Millen used Shannon’s [10] approach that employed finite-difference
equations to show that

$$C_t(\infty) = \log\left(\frac{1 + \sqrt{5}}{2}\right). \tag{15}$$

We will show this by analyzing the limiting behavior of (14) as
$\lambda \to \infty$. In this case x = 0, y = 1, u = 1/2, and $\eta = 2$, so that we have

$$p_c = \frac{1}{2}\,\Psi\!\left[\begin{matrix} \left(\tfrac{1}{2}, \tfrac{1}{2}\right); \\ \left(\tfrac{3}{2}, -\tfrac{1}{2}\right); \end{matrix}\; -1\right] = \frac{1}{2} \sum_{k=0}^{\infty} \frac{\Gamma\!\left(\tfrac{1}{2} + \tfrac{k}{2}\right)}{\Gamma\!\left(\tfrac{3}{2} - \tfrac{k}{2}\right)}\,\frac{(-1)^k}{k!}.$$

Separating the terms of even and odd index, and since $1/\Gamma(1 - k)$
vanishes for $k \ge 1$ and

$$\Gamma\!\left(\tfrac{3}{2} - k\right) = \Gamma\!\left(\tfrac{3}{2}\right) \frac{(-1)^k}{(-1/2)_k},$$

we arrive at

$$p_c = -\frac{1}{2} + \sum_{k=0}^{\infty} \left(-\tfrac{1}{2}\right)_k \frac{(-1/4)^k}{k!}$$

and

$$\frac{2p_c + 1}{2} = {}_1F_0[-1/2;\; -1/4],$$

where ${}_1F_0$ is a generalized hypergeometric function. Since
${}_1F_0[a;\, z] = (1 - z)^{-a}$, we have that

$$p_c = \frac{-1 + \sqrt{5}}{2},$$

and now after some algebraic manipulation we deduce

$$I_t\!\left(\frac{-1 + \sqrt{5}}{2}\right) = \log\left(\frac{1 + \sqrt{5}}{2}\right).$$

Thus, we have obtained Millen’s result (15) as a special case. As we
did for Strategy 1, numerical calculations also show that for
Strategy 2 the channel capacity $C_t$ and the mutual information $I_t$
evaluated at the limiting value of $(-1 + \sqrt{5})/2$ are quite close for
all values of $\lambda$.

VII.  Conclusion  and  Directions  for  Future  Research

We have shown how to incorporate noise into the capacity
calculations of certain timing channels. Strategy 1 is rather
simplistic, but it is a necessary step to understanding Strategy 2. In
addition, Strategy 1 is useful if there is no feedback to high.

In future work, we shall relax the restriction that the responses
arrive at 1 or 2 ticks and allow variable response times. We can
always normalize the lesser time value to 1 tick, so we will
investigate the situation where the responses arrive at 1 or $\theta$ ticks, $\theta$
being variable. The noise in the channel decreases with increasing $\theta$
but the time required to send the symbol 1 across the channel
increases. Thus, we have an optimization problem for the capacity
with respect to $\theta$.

At present, in Strategy 2 we allow low to instantaneously
interrupt its query. An interesting alternate scenario would permit some
delay in the interrupt. Also, the necessity of an interrupt could be
mitigated by using a series of distinct queries whose responses are
known and inputted in a cyclically repeating order.
On the Capacity Region of the Discrete Additive
Multiple-Access  Arbitrarily  Varying  Channel 

John  A.  Gubner,  Member,  IEEE 

Abstract — The discrete additive multiple-access arbitrarily varying
channel (AVC) with two senders and one receiver is considered.
Necessary and sufficient conditions are given for its deterministic-code
average-probability-of-error capacity region under a state constraint to
have a nonempty interior. In the case that no state constraint is present,
the capacity region is characterized exactly. In the case of the noiseless
mod-2 adder AVC using state constraint function $l(s) = s$ and subject
to a state constraint L less than or equal to 0.13616917, the capacity
region is shown to be a 45-degree triangle whose legs have length
$1 - h(L)$, where h denotes the binary entropy function.

Index Terms — Additive channel, multiple-access, arbitrarily varying
channel, state constraint, capacity region.

I.  Introduction 

A general multiple-access arbitrarily varying channel (AVC) with
two senders and one receiver is a transition probability W from
$X \times Y \times S$ into Z, where X, Y, S, and Z are finite sets, each
containing at least two elements. We interpret $W(z \mid x, y, s)$ as the
conditional probability that the channel output is $z \in Z$ given that the
channel input symbol from sender 1 is $x \in X$, the channel input
symbol from sender 2 is $y \in Y$, and that the channel state is $s \in S$.
When block codes of length n are used, we say the AVC is subject
to state constraint L if the state-selection mechanism can generate
only those state sequences $\mathbf{s} = (s_1, \ldots, s_n)$ that satisfy a
time-average constraint of the form

$$\frac{1}{n} \sum_{k=1}^{n} l(s_k) \le L, \tag{1}$$

where l is a given nonnegative constraint function defined on S and
satisfying $\min_s l(s) = 0$. Note that if $L \ge \max_s l(s)$, then all state
sequences $\mathbf{s}$ satisfy (1); in this case we say that the state constraint
is not present, or inactive.

Definition (Additive AVC): Let G be a finite nontrivial commutative
group. Suppose that X = Y = Z = G. We say that W is an
additive AVC if

$$W(z \mid x, y, s) = V_s(z - x - y),$$

for some transition probability V from S into G.

General  multiple-access  AVC’s  subject  to  a  state  constraint  have 
been  studied  in  [6].  There,  both  forward  and  converse  results  were 
proved  that  enable  one  to  give  inner  and  outer  bounds  on  the 
capacity  region.  To  obtain  meaningful  inner  bounds,  one  must 
exhibit input probability distributions for which certain inequalities
are  nonvacuous.  We  show  that  for  the  additive  AVC  such  input 
distributions  always  exist. 

In  the  absence  of  state  constraints,  we  exactly  characterize  the 
capacity  region  of  the  additive  AVC. 

In the special case of the noiseless mod-2 adder AVC with
$l(s) = s$ and state constraint $L \le 0.13616917$, the capacity region
is shown to be a 45° triangle whose legs have length $1 - h(L)$,
where h denotes the binary entropy function defined in Theorem 3.

Additive AVC’s with one sender and one receiver were considered
in [4, Section V], but under the assumption that the channel
symbols  come  from  a  finite  subset  of  rather  than  a  finite 
commutative  group  G.  This  is  in  contrast  to  the  results  of  [4, 
Section  IV]  concerning  a  restricted  form  of  additive  AVC  called  a 
group adder AVC, which is an additive AVC for which S = G and
$V_s(t) = \mu(t - s)$ for some probability distribution $\mu$ on G. In an
earlier paper [3, Section IV] Csiszár and Narayan analyzed the
single-user noiseless mod-2 adder AVC.

II.  Statement  of  Results 

In order to state our results, we need the following notation. Let
$\mathcal{D}(S)$ denote the set of probability distributions on S. For
$r \in \mathcal{D}(S)$, let rV denote the distribution on G defined by
$(rV)(t) = \sum_s r(s) V_s(t)$. Let H(rV) denote the entropy of rV. Let

$$\mathcal{D}_L(S) = \Bigl\{ r \in \mathcal{D}(S) : \sum_{s \in S} l(s)\,r(s) \le L \Bigr\}.$$

Note that if $L \ge \max_s l(s)$, then $\mathcal{D}_L(S) = \mathcal{D}(S)$. We now state our
main results.

Theorem 1: The deterministic-code average-probability-of-error
capacity region under state constraint L of an additive
multiple-access AVC V has a nonempty interior, if and only if there is no
$r \in \mathcal{D}_L(S)$ such that rV is the uniform distribution on G.
Furthermore, the capacity region is always contained in the 45° triangle,

$$\Bigl\{ (R_1, R_2) : R_1 \ge 0,\ R_2 \ge 0,\ \text{and}\ R_1 + R_2 \le \log |G| - \max_{r \in \mathcal{D}_L(S)} H(rV) \Bigr\}, \tag{2}$$

where |G| denotes the cardinality of the set G.
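
To make the outer bound (2) concrete, the leg length log |G| − max H(rV) can be evaluated for a binary example. The sketch below is a hedged illustration, not part of the paper: it assumes G is the integers mod 2, two states with constraint function l(s) = s, and a noise kernel supplied as a table `V[s][t]`; the names `entropy_bits`, `max_output_entropy`, `triangle_leg`, and `NOISELESS_MOD2` are hypothetical, and the maximum is found by a plain grid search over the admissible state distributions.

```python
import math


def entropy_bits(dist):
    """Shannon entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(q * math.log2(q) for q in dist if q > 0.0)


def max_output_entropy(V, L, steps=10_000):
    """Maximize H(rV) over r = (1 - a, a) subject to a <= L, which is the
    constraint sum_s l(s) r(s) <= L when l(s) = s on states {0, 1}.
    V[s][t] is the probability that state s contributes noise t."""
    a_max = min(L, 1.0)
    best = 0.0
    for i in range(steps + 1):
        a = a_max * i / steps
        rV = [(1.0 - a) * V[0][t] + a * V[1][t] for t in range(2)]
        best = max(best, entropy_bits(rV))
    return best


def triangle_leg(V, L):
    """Leg length of the 45-degree triangle in (2) for |G| = 2."""
    return math.log2(2) - max_output_entropy(V, L)


# Assumed model of the noiseless mod-2 adder: the state is added directly,
# so V_s puts all of its mass on t = s.
NOISELESS_MOD2 = [[1.0, 0.0], [0.0, 1.0]]

if __name__ == "__main__":
    for L in (0.05, 0.1, 0.13616917, 0.5):
        print(L, round(triangle_leg(NOISELESS_MOD2, L), 6))
```

For this assumed kernel rV equals r, so the leg reduces to 1 − h(L) for L up to 1/2 and collapses to zero once the uniform state distribution becomes admissible.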

Remark: Since $\mathcal{D}_L(S)$ is compact and since H is continuous,

$$\log |G| > \max_{r \in \mathcal{D}_L(S)} H(rV), \tag{3}$$

if and only if there is no $r \in \mathcal{D}_L(S)$ such that rV is the uniform
distribution on G.

Theorem 2: In the absence of state constraints, the capacity
region of the additive multiple-access AVC V is always given by
(2), where $\mathcal{D}_L(S)$ is replaced by $\mathcal{D}(S)$.

Proof: Theorem 2 follows from Theorem 1, the preceding
Remark, [7, Theorem 1, p. 214] (which says that if the
deterministic-code average-probability-of-error capacity region has a
nonempty interior, then it is equal to the random-code
average-probability-of-error capacity region), and [6, Section IV], which
shows that the random-code average-probability-of-error capacity
region of the additive AVC is given by (2). We give an independent
proof in Section V. □